Overview

Dataset statistics

Number of variables: 32
Number of observations: 1980
Missing cells: 1341
Missing cells (%): 2.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 495.1 KiB
Average record size in memory: 256.1 B
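The missing-cell percentage can be sanity-checked from the counts above. The following is a minimal sketch, assuming the percentage is taken over every cell in the table (variables × observations); that assumption reproduces the reported 2.1%.

```python
# Counts copied from the dataset statistics above.
n_variables = 32
n_observations = 1980
missing_cells = 1341

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```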

Variable types

CAT: 20
NUM: 9
BOOL: 3

Warnings

Sex has constant value "Female" [Constant]
Relapse Free Status (Months) is highly correlated with Overall Survival (Months) [High correlation]
Overall Survival (Months) is highly correlated with Relapse Free Status (Months) [High correlation]
Tumor Other Histologic Subtype is highly correlated with Cancer Type Detailed and 1 other field [High correlation]
Cancer Type Detailed is highly correlated with Tumor Other Histologic Subtype and 1 other field [High correlation]
Oncotree Code is highly correlated with Cancer Type Detailed and 1 other field [High correlation]
Patient's Vital Status is highly correlated with Overall Survival Status [High correlation]
Overall Survival Status is highly correlated with Patient's Vital Status [High correlation]
Type of Breast Surgery has 26 (1.3%) missing values [Missing]
Cellularity has 64 (3.2%) missing values [Missing]
ER status measured by IHC has 43 (2.2%) missing values [Missing]
Neoplasm Histologic Grade has 88 (4.4%) missing values [Missing]
Tumor Other Histologic Subtype has 44 (2.2%) missing values [Missing]
Primary Tumor Laterality has 111 (5.6%) missing values [Missing]
Lymph nodes examined positive has 76 (3.8%) missing values [Missing]
Mutation Count has 121 (6.1%) missing values [Missing]
3-Gene classifier subtype has 217 (11.0%) missing values [Missing]
Tumor Size has 26 (1.3%) missing values [Missing]
Tumor Stage has 515 (26.0%) missing values [Missing]
df_index has unique values [Unique]
Lymph nodes examined positive has 993 (50.2%) zeros [Zeros]

Reproduction

Analysis started: 2021-01-27 15:52:31.297950
Analysis finished: 2021-01-27 15:53:00.865792
Duration: 29.57 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
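A report of this kind is typically produced with a few lines of pandas-profiling. The sketch below is an assumption about how such a run could look, not the exact command used here: the function name `build_report`, the report title, and the file paths in the usage note are hypothetical, and the packages are imported inside the function so the snippet loads even where they are not installed.

```python
def build_report(csv_path, html_path, config_path=None):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Local imports: pandas and pandas-profiling are only needed when called.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # config_file is optional; when given, it points at a YAML configuration
    # such as the config.yaml offered for download above.
    profile = ProfileReport(df, title="Dataset profile", config_file=config_path)
    profile.to_file(html_path)
    return profile
```

A call such as `build_report("clinical_data.csv", "report.html", "config.yaml")` would then write the report; all three paths are placeholders.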

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 1980
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 993.4505051
Minimum: 0
Maximum: 1984
Zeros: 1
Zeros (%): 0.1%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 99.95
Q1: 498.75
median: 993.5
Q3: 1489.25
95-th percentile: 1885.05
Maximum: 1984
Range: 1984
Interquartile range (IQR): 990.5

Descriptive statistics

Standard deviation: 572.7697738
Coefficient of variation (CV): 0.576545858
Kurtosis: -1.198678198
Mean: 993.4505051
Median Absolute Deviation (MAD): 495.5
Skewness: -0.002117657611
Sum: 1967032
Variance: 328065.2138
Monotonicity: Strictly increasing
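Several of the statistics above are simple functions of the others. As a quick check, the sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported df_index values; the variable names are illustrative.

```python
# Values copied from the df_index statistics above.
minimum, maximum = 0, 1984
q1, q3 = 498.75, 1489.25
mean, std = 993.4505051, 572.7697738

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance, 2))
```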
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1 | 0.1%
683 | 1 | 0.1%
679 | 1 | 0.1%
677 | 1 | 0.1%
675 | 1 | 0.1%
673 | 1 | 0.1%
671 | 1 | 0.1%
669 | 1 | 0.1%
667 | 1 | 0.1%
665 | 1 | 0.1%
Other values (1970) | 1970 | 99.5%

Minimum 5 values
Value | Count | Frequency (%)
0 | 1 | 0.1%
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
1984 | 1 | 0.1%
1983 | 1 | 0.1%
1982 | 1 | 0.1%
1981 | 1 | 0.1%
1980 | 1 | 0.1%

Age at Diagnosis
Real number (ℝ≥0)

Distinct: 1623
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.08823737
Minimum: 21.93
Maximum: 96.29
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 21.93
5-th percentile: 38.856
Q1: 51.42
median: 61.825
Q3: 70.6025
95-th percentile: 81.149
Maximum: 96.29
Range: 74.36
Interquartile range (IQR): 19.1825

Descriptive statistics

Standard deviation: 12.95286156
Coefficient of variation (CV): 0.2120352808
Kurtosis: -0.5440581195
Mean: 61.08823737
Median Absolute Deviation (MAD): 9.395
Skewness: -0.2025744121
Sum: 120954.71
Variance: 167.7766227
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
49.61 | 5 | 0.3%
61.16 | 4 | 0.2%
64.01 | 4 | 0.2%
43.08 | 3 | 0.2%
74.76 | 3 | 0.2%
60.33 | 3 | 0.2%
78.19 | 3 | 0.2%
50.08 | 3 | 0.2%
67.46 | 3 | 0.2%
66.91 | 3 | 0.2%
Other values (1613) | 1946 | 98.3%

Minimum 5 values
Value | Count | Frequency (%)
21.93 | 1 | 0.1%
26.36 | 1 | 0.1%
26.72 | 1 | 0.1%
27.56 | 1 | 0.1%
28.04 | 1 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
96.29 | 1 | 0.1%
92.14 | 1 | 0.1%
90.43 | 1 | 0.1%
90.23 | 1 | 0.1%
90.08 | 1 | 0.1%

Type of Breast Surgery
Categorical

MISSING

Distinct: 2
Distinct (%): 0.1%
Missing: 26
Missing (%): 1.3%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
MASTECTOMY | 1170 | 59.1%
BREAST CONSERVING | 784 | 39.6%
(Missing) | 26 | 1.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 17
Median length: 10
Mean length: 12.67979798
Min length: 3
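The length statistics for Type of Breast Surgery include a minimum of 3, which is shorter than either category label. The sketch below shows that the reported figures are consistent with each missing value being counted as the 3-character string "nan"; that interpretation is an assumption inferred from the numbers, not something the report states.

```python
from statistics import median

# Category counts copied from the frequency table above.
counts = {"MASTECTOMY": 1170, "BREAST CONSERVING": 784}
n_missing = 26

# Assumption: each missing value contributes the 3-character string "nan".
lengths = []
for label, count in counts.items():
    lengths.extend([len(label)] * count)
lengths.extend([len("nan")] * n_missing)

mean_length = sum(lengths) / len(lengths)
median_length = median(lengths)

print(min(lengths), max(lengths), median_length, round(mean_length, 8))
```

Under that assumption the minimum, maximum, median, and mean lengths all agree with the values listed above.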

Cancer Type Detailed
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Breast Invasive Ductal Carcinoma | 1537 | 77.6%
Breast Mixed Ductal and Lobular Carcinoma | 211 | 10.7%
Breast Invasive Lobular Carcinoma | 146 | 7.4%
Invasive Breast Carcinoma | 42 | 2.1%
Breast Invasive Mixed Mucinous Carcinoma | 23 | 1.2%
Breast | 17 | 0.9%
Breast Angiosarcoma | 2 | 0.1%
Metaplastic Breast Cancer | 2 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 41
Median length: 32
Mean length: 32.73383838
Min length: 6

Cellularity
Categorical

MISSING

Distinct: 3
Distinct (%): 0.2%
Missing: 64
Missing (%): 3.2%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
High | 965 | 48.7%
Moderate | 737 | 37.2%
Low | 214 | 10.8%
(Missing) | 64 | 3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 4
Mean length: 5.348484848
Min length: 3

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
NO | 1567 | 79.1%
YES | 412 | 20.8%
(Missing) | 1 | 0.1%

Distinct: 7
Distinct (%): 0.4%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
LumA | 699 | 35.3%
LumB | 475 | 24.0%
Her2 | 224 | 11.3%
claudin-low | 218 | 11.0%
Basal | 209 | 10.6%
Normal | 148 | 7.5%
NC | 6 | 0.3%
(Missing) | 1 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 11
Median length: 4
Mean length: 5.019191919
Min length: 2

ER status measured by IHC
Categorical

MISSING

Distinct: 2
Distinct (%): 0.1%
Missing: 43
Missing (%): 2.2%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Positve | 1498 | 75.7%
Negative | 439 | 22.2%
(Missing) | 43 | 2.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 7
Mean length: 7.134848485
Min length: 3

ER Status
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Positive | 1506 | 76.1%
Negative | 474 | 23.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Neoplasm Histologic Grade
Categorical

MISSING

Distinct: 3
Distinct (%): 0.2%
Missing: 88
Missing (%): 4.4%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
3 | 952 | 48.1%
2 | 771 | 38.9%
1 | 169 | 8.5%
(Missing) | 88 | 4.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Distinct: 4
Distinct (%): 0.2%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
NEUTRAL | 1436 | 72.5%
GAIN | 438 | 22.1%
LOSS | 100 | 5.1%
UNDEF | 5 | 0.3%
(Missing) | 1 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 6.177777778
Min length: 3

HER2 Status
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Negative | 1732 | 87.5%
Positive | 247 | 12.5%
(Missing) | 1 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.997474747
Min length: 3

Tumor Other Histologic Subtype
Categorical

HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): 0.4%
Missing: 44
Missing (%): 2.2%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Ductal/NST | 1491 | 75.3%
Mixed | 211 | 10.7%
Lobular | 146 | 7.4%
Medullary | 25 | 1.3%
Mucinous | 23 | 1.2%
Tubular/ cribriform | 21 | 1.1%
Other | 17 | 0.9%
Metaplastic | 2 | 0.1%
(Missing) | 44 | 2.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 19
Median length: 10
Mean length: 9.108080808
Min length: 3

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
YES | 1216 | 61.4%
NO | 763 | 38.5%
(Missing) | 1 | 0.1%

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Post | 1555 | 78.5%
Pre | 424 | 21.4%
(Missing) | 1 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 4
Median length: 4
Mean length: 3.785353535
Min length: 3

Distinct: 11
Distinct (%): 0.6%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
8 | 299 | 15.1%
3 | 290 | 14.6%
4ER+ | 260 | 13.1%
10 | 226 | 11.4%
5 | 190 | 9.6%
7 | 189 | 9.5%
9 | 146 | 7.4%
1 | 139 | 7.0%
6 | 85 | 4.3%
4ER- | 83 | 4.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 4
Median length: 1
Mean length: 1.634848485
Min length: 1

Primary Tumor Laterality
Categorical

MISSING

Distinct: 2
Distinct (%): 0.1%
Missing: 111
Missing (%): 5.6%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Left | 973 | 49.1%
Right | 896 | 45.3%
(Missing) | 111 | 5.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 4
Mean length: 4.396464646
Min length: 3

Lymph nodes examined positive
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 31
Distinct (%): 1.6%
Missing: 76
Missing (%): 3.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.00210084
Minimum: 0
Maximum: 45
Zeros: 993
Zeros (%): 50.2%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 10
Maximum: 45
Range: 45
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.079993071
Coefficient of variation (CV): 2.03785593
Kurtosis: 20.73048305
Mean: 2.00210084
Median Absolute Deviation (MAD): 0
Skewness: 3.839109986
Sum: 3812
Variance: 16.64634346
Monotonicity: Not monotonic
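When a variable has missing values, it matters which denominator each statistic uses. The sketch below, using only the figures reported for Lymph nodes examined positive, suggests that the mean is taken over non-missing rows while the zero and missing percentages are taken over all rows; this is inferred from the arithmetic rather than stated by the report.

```python
# Figures copied from the statistics above.
n_rows = 1980
n_missing = 76
n_zeros = 993
total = 3812  # reported Sum

n_valid = n_rows - n_missing             # rows with an observed value
mean = total / n_valid                   # mean over non-missing rows only
zeros_pct = 100 * n_zeros / n_rows       # percentage relative to all rows
missing_pct = 100 * n_missing / n_rows   # percentage relative to all rows

print(n_valid, round(mean, 8), round(zeros_pct, 1), round(missing_pct, 1))
```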
Histogram with fixed size bins (bins=31)

Common values
Value | Count | Frequency (%)
0 | 993 | 50.2%
1 | 332 | 16.8%
2 | 163 | 8.2%
3 | 109 | 5.5%
4 | 54 | 2.7%
6 | 48 | 2.4%
5 | 43 | 2.2%
7 | 24 | 1.2%
8 | 20 | 1.0%
14 | 15 | 0.8%
Other values (21) | 103 | 5.2%
(Missing) | 76 | 3.8%

Minimum 5 values
Value | Count | Frequency (%)
0 | 993 | 50.2%
1 | 332 | 16.8%
2 | 163 | 8.2%
3 | 109 | 5.5%
4 | 54 | 2.7%

Maximum 5 values
Value | Count | Frequency (%)
45 | 1 | 0.1%
41 | 1 | 0.1%
33 | 1 | 0.1%
31 | 1 | 0.1%
26 | 1 | 0.1%

Mutation Count
Real number (ℝ≥0)

MISSING

Distinct: 30
Distinct (%): 1.6%
Missing: 121
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.693921463
Minimum: 1
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5
Q3: 7
95-th percentile: 12
Maximum: 80
Range: 79
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.055245597
Coefficient of variation (CV): 0.7122060997
Kurtosis: 70.40122434
Mean: 5.693921463
Median Absolute Deviation (MAD): 2
Skewness: 5.196280579
Sum: 10585
Variance: 16.44501685
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values
Value | Count | Frequency (%)
5 | 268 | 13.5%
4 | 248 | 12.5%
6 | 233 | 11.8%
3 | 229 | 11.6%
2 | 193 | 9.7%
7 | 168 | 8.5%
8 | 121 | 6.1%
1 | 107 | 5.4%
9 | 90 | 4.5%
10 | 61 | 3.1%
Other values (20) | 141 | 7.1%
(Missing) | 121 | 6.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 107 | 5.4%
2 | 193 | 9.7%
3 | 229 | 11.6%
4 | 248 | 12.5%
5 | 268 | 13.5%

Maximum 5 values
Value | Count | Frequency (%)
80 | 1 | 0.1%
46 | 1 | 0.1%
40 | 1 | 0.1%
30 | 1 | 0.1%
28 | 1 | 0.1%

Nottingham prognostic index
Real number (ℝ≥0)

Distinct: 320
Distinct (%): 16.2%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.010950743
Minimum: 1
Maximum: 6.36
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.034
Q1: 3.044
median: 4.042
Q3: 5.04
95-th percentile: 6.06
Maximum: 6.36
Range: 5.36
Interquartile range (IQR): 1.996

Descriptive statistics

Standard deviation: 1.16279144
Coefficient of variation (CV): 0.2899041933
Kurtosis: -0.3053460491
Mean: 4.010950743
Median Absolute Deviation (MAD): 0.998
Skewness: -0.1045061481
Sum: 7937.67152
Variance: 1.352083932
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
4.04 | 86 | 4.3%
3.04 | 73 | 3.7%
4.05 | 58 | 2.9%
3.03 | 57 | 2.9%
4.03 | 57 | 2.9%
4.06 | 49 | 2.5%
5.05 | 47 | 2.4%
5.06 | 47 | 2.4%
3.05 | 43 | 2.2%
5.04 | 39 | 2.0%
Other values (310) | 1423 | 71.9%

Minimum 5 values
Value | Count | Frequency (%)
1 | 5 | 0.3%
1.02 | 2 | 0.1%
1.022 | 1 | 0.1%
1.024 | 2 | 0.1%
1.028 | 1 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
6.36 | 1 | 0.1%
6.32 | 1 | 0.1%
6.3 | 1 | 0.1%
6.26 | 2 | 0.1%
6.24 | 1 | 0.1%

Oncotree Code
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
IDC | 1537 | 77.6%
MDLC | 211 | 10.7%
ILC | 146 | 7.4%
BRCA | 42 | 2.1%
IMMC | 23 | 1.2%
BREAST | 17 | 0.9%
MBC | 2 | 0.1%
PBS | 2 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 6
Median length: 3
Mean length: 3.165151515
Min length: 3

Overall Survival (Months)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1742
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 125.1787374
Minimum: 0
Maximum: 355.2
Zeros: 1
Zeros (%): 0.1%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 19.09666667
Q1: 60.825
median: 116.45
Q3: 185.0333333
95-th percentile: 259.9683334
Maximum: 355.2
Range: 355.2
Interquartile range (IQR): 124.2083333

Descriptive statistics

Standard deviation: 76.07507543
Coefficient of variation (CV): 0.6077316087
Kurtosis: -0.7867700149
Mean: 125.1787374
Median Absolute Deviation (MAD): 61.06666665
Skewness: 0.3739527572
Sum: 247853.9
Variance: 5787.417102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
192.2 | 4 | 0.2%
48.43333333 | 3 | 0.2%
119.4666667 | 3 | 0.2%
98.7 | 3 | 0.2%
96.96666667 | 3 | 0.2%
43.1 | 3 | 0.2%
19.73333333 | 3 | 0.2%
187.0333333 | 3 | 0.2%
108.0666667 | 3 | 0.2%
117.6666667 | 3 | 0.2%
Other values (1732) | 1949 | 98.4%

Minimum 5 values
Value | Count | Frequency (%)
0 | 1 | 0.1%
0.1 | 1 | 0.1%
0.766666667 | 1 | 0.1%
1.233333333 | 1 | 0.1%
1.266666667 | 1 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
355.2 | 1 | 0.1%
351 | 1 | 0.1%
337.0333333 | 1 | 0.1%
335.7333333 | 1 | 0.1%
335.6 | 1 | 0.1%

Overall Survival Status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
1:DECEASED | 1143 | 57.7%
0:LIVING | 837 | 42.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 9.154545455
Min length: 8

PR Status
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Positive | 1039 | 52.5%
Negative | 940 | 47.5%
(Missing) | 1 | 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.997474747
Min length: 3

Distinct: 2
Distinct (%): 0.1%
Missing: 1
Missing (%): 0.1%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
YES | 1172 | 59.2%
NO | 807 | 40.8%
(Missing) | 1 | 0.1%

Relapse Free Status (Months)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1715
Distinct (%): 86.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 110.0013889
Minimum: 0
Maximum: 346.38
Zeros: 4
Zeros (%): 0.2%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10.86
Q1: 41.7325
median: 100.72
Q3: 167.4725
95-th percentile: 251.6165
Maximum: 346.38
Range: 346.38
Interquartile range (IQR): 125.74

Descriptive statistics

Standard deviation: 76.26207097
Coefficient of variation (CV): 0.6932828007
Kurtosis: -0.7362096243
Mean: 110.0013889
Median Absolute Deviation (MAD): 62.04
Skewness: 0.4799066027
Sum: 217802.75
Variance: 5815.903469
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 4 | 0.2%
32.11 | 4 | 0.2%
189.67 | 4 | 0.2%
102.66 | 3 | 0.2%
30.2 | 3 | 0.2%
15.95 | 3 | 0.2%
121.09 | 3 | 0.2%
27.11 | 3 | 0.2%
30.76 | 3 | 0.2%
107.57 | 3 | 0.2%
Other values (1705) | 1947 | 98.3%

Minimum 5 values
Value | Count | Frequency (%)
0 | 4 | 0.2%
0.1 | 1 | 0.1%
0.36 | 1 | 0.1%
0.69 | 2 | 0.1%
0.76 | 1 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
346.38 | 1 | 0.1%
331.18 | 1 | 0.1%
326.02 | 1 | 0.1%
318.59 | 1 | 0.1%
303.88 | 1 | 0.1%

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
0:Not Recurred | 1177 | 59.4%
1:Recurred | 803 | 40.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 14
Median length: 14
Mean length: 12.37777778
Min length: 10

Sex
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Female | 1980 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

3-Gene classifier subtype
Categorical

MISSING

Distinct: 4
Distinct (%): 0.2%
Missing: 217
Missing (%): 11.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
ER+/HER2- Low Prolif | 640 | 32.3%
ER+/HER2- High Prolif | 616 | 31.1%
ER-/HER2- | 309 | 15.6%
HER2+ | 198 | 10.0%
(Missing) | 217 | 11.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 21
Median length: 20
Mean length: 15.23131313
Min length: 3

Tumor Size
Real number (ℝ≥0)

MISSING

Distinct: 112
Distinct (%): 5.7%
Missing: 26
Missing (%): 1.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.28135107
Minimum: 1
Maximum: 182
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11
Q1: 17
median: 23
Q3: 30
95-th percentile: 52
Maximum: 182
Range: 181
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 15.38167819
Coefficient of variation (CV): 0.5852696897
Kurtosis: 19.97341057
Mean: 26.28135107
Median Absolute Deviation (MAD): 7
Skewness: 3.215335199
Sum: 51353.76
Variance: 236.5960239
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
20 | 236 | 11.9%
25 | 171 | 8.6%
30 | 156 | 7.9%
15 | 152 | 7.7%
18 | 77 | 3.9%
35 | 74 | 3.7%
40 | 68 | 3.4%
16 | 66 | 3.3%
22 | 63 | 3.2%
17 | 57 | 2.9%
Other values (102) | 834 | 42.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 9 | 0.5%
2 | 4 | 0.2%
2.12 | 1 | 0.1%
2.3 | 1 | 0.1%
3 | 4 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
182 | 1 | 0.1%
180 | 1 | 0.1%
160 | 1 | 0.1%
150 | 1 | 0.1%
130 | 2 | 0.1%

Tumor Stage
Real number (ℝ≥0)

MISSING

Distinct: 5
Distinct (%): 0.3%
Missing: 515
Missing (%): 26.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.736518771
Minimum: 0
Maximum: 4
Zeros: 12
Zeros (%): 0.6%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6424723026
Coefficient of variation (CV): 0.3699771711
Kurtosis: 0.3085681934
Mean: 1.736518771
Median Absolute Deviation (MAD): 0
Skewness: 0.274582034
Sum: 2544
Variance: 0.4127706597
Monotonicity: Not monotonic
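Because Tumor Stage takes only five distinct values, its descriptive statistics can be rebuilt from the value counts alone. The sketch below does so under the assumption that the variance uses the sample (n - 1) denominator; that choice reproduces the reported variance, whereas dividing by n does not.

```python
import math

# Value counts for Tumor Stage, copied from the report (missing rows excluded).
freq = {0: 12, 1: 500, 2: 825, 3: 118, 4: 10}

n = sum(freq.values())                             # non-missing observations
total = sum(value * count for value, count in freq.items())
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 9), round(variance, 10), round(std, 10))
```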
Histogram with fixed size bins (bins=5)

Common values
Value | Count | Frequency (%)
2 | 825 | 41.7%
1 | 500 | 25.3%
3 | 118 | 6.0%
0 | 12 | 0.6%
4 | 10 | 0.5%
(Missing) | 515 | 26.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 12 | 0.6%
1 | 500 | 25.3%
2 | 825 | 41.7%
3 | 118 | 6.0%
4 | 10 | 0.5%

Maximum 5 values
Value | Count | Frequency (%)
4 | 10 | 0.5%
3 | 118 | 6.0%
2 | 825 | 41.7%
1 | 500 | 25.3%
0 | 12 | 0.6%

Patient's Vital Status
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 KiB

Value | Count | Frequency (%)
Living | 837 | 42.3%
Died of Disease | 646 | 32.6%
Died of Other Causes | 497 | 25.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 20
Median length: 15
Mean length: 12.45050505
Min length: 6

Interactions

Pairwise interaction scatter plots (81 panels; images not included in this text export).

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the steepness of the slope does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
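
The calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code path used to generate this report; the function name `pearson_r` and the sample arrays are made up for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Population covariance (divide by n), matching np.std's default ddof=0.
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

# Hypothetical data: y is an exact increasing linear function of x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 * x + 2.0

r_linear = pearson_r(x, y)                  # perfectly linear, increasing
r_rescaled = pearson_r(10.0 * x - 7.0, y)   # shifting and rescaling x leaves r unchanged
r_flipped = pearson_r(x, -y)                # a decreasing line gives the opposite sign
```

Because the same normalisation (division by n) is used for the covariance and for both standard deviations, the result should agree with `np.corrcoef(x, y)[0, 1]`.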

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
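
As a rough sketch of that recipe, the snippet below ranks both variables (ties receive the average of their ranks, which is an assumption of this example) and then computes the ordinary linear correlation of the ranks. The helper names `average_ranks` and `spearman_rho` and the sample data are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)          # monotonic relationship, so rho is 1
r = np.corrcoef(x, y)[0, 1]       # Pearson's r is below 1 for the same data
```

The comparison between `rho` and `r` shows why the rank-based coefficient is better suited to nonlinear monotonic relationships.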

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then given by (number of concordant pairs - number of discordant pairs) divided by the total number of pairs.
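
The pair-counting definition above can be written out directly. The sketch below implements that plain variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant, so results may differ when ties are present. The function name `kendall_tau_a` and the sample data are illustrative only.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant -> (5 - 1) / 6
```

This brute-force version compares every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow for large columns.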

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
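
For illustration, a bias-corrected Cramér's V in the style of Bergsma (2013) can be computed from a contingency table as sketched below. This is a minimal sketch and not necessarily the exact computation behind this report: it uses the plain chi-squared statistic without continuity correction and assumes every row and column total is non-zero. The function name `cramers_v_corrected` and the example tables are hypothetical.

```python
import numpy as np

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    r, k = table.shape
    # Expected counts under independence, then the chi-squared statistic.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return float(np.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1)))

# Hypothetical 2 x 2 tables of counts.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])  # rows are proportional
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # categories coincide
```

With the correction, a table whose rows are exactly proportional yields 0, and a table where each category of one variable maps to exactly one category of the other yields 1.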

Missing values

Sample

First rows

| | df_index | Age at Diagnosis | Type of Breast Surgery | Cancer Type Detailed | Cellularity | Chemotherapy | Pam50 + Claudin-low subtype | ER status measured by IHC | ER Status | Neoplasm Histologic Grade | HER2 status measured by SNP6 | HER2 Status | Tumor Other Histologic Subtype | Hormone Therapy | Inferred Menopausal State | Integrative Cluster | Primary Tumor Laterality | Lymph nodes examined positive | Mutation Count | Nottingham prognostic index | Oncotree Code | Overall Survival (Months) | Overall Survival Status | PR Status | Radio Therapy | Relapse Free Status (Months) | Relapse Free Status | Sex | 3-Gene classifier subtype | Tumor Size | Tumor Stage | Patient's Vital Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 75.65 | MASTECTOMY | Breast Invasive Ductal Carcinoma | NaN | NO | claudin-low | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 4ER+ | Right | 10.0 | NaN | 6.044 | IDC | 140.500000 | 0:LIVING | Negative | YES | 138.65 | 0:Not Recurred | Female | ER-/HER2- | 22.0 | 2.0 | Living |
| 1 | 1 | 43.19 | BREAST CONSERVING | Breast Invasive Ductal Carcinoma | High | NO | LumA | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Pre | 4ER+ | Right | 0.0 | 2.0 | 4.020 | IDC | 84.633333 | 0:LIVING | Positive | YES | 83.52 | 0:Not Recurred | Female | ER+/HER2- High Prolif | 10.0 | 1.0 | Living |
| 2 | 2 | 48.87 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | YES | LumB | Positve | Positive | 2.0 | NEUTRAL | Negative | Ductal/NST | YES | Pre | 3 | Right | 1.0 | 2.0 | 4.030 | IDC | 163.700000 | 1:DECEASED | Positive | NO | 151.28 | 1:Recurred | Female | NaN | 15.0 | 2.0 | Died of Disease |
| 3 | 3 | 47.68 | MASTECTOMY | Breast Mixed Ductal and Lobular Carcinoma | Moderate | YES | LumB | Positve | Positive | 2.0 | NEUTRAL | Negative | Mixed | YES | Pre | 9 | Right | 3.0 | 1.0 | 4.050 | MDLC | 164.933333 | 0:LIVING | Positive | YES | 162.76 | 0:Not Recurred | Female | NaN | 25.0 | 2.0 | Living |
| 4 | 4 | 76.97 | MASTECTOMY | Breast Mixed Ductal and Lobular Carcinoma | High | YES | LumB | Positve | Positive | 3.0 | NEUTRAL | Negative | Mixed | YES | Post | 9 | Right | 8.0 | 2.0 | 6.080 | MDLC | 41.366667 | 1:DECEASED | Positive | YES | 18.55 | 1:Recurred | Female | ER+/HER2- High Prolif | 40.0 | 2.0 | Died of Disease |
| 5 | 5 | 78.77 | MASTECTOMY | Breast Invasive Ductal Carcinoma | Moderate | NO | LumB | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 7 | Left | 0.0 | 4.0 | 4.062 | IDC | 7.800000 | 1:DECEASED | Positive | YES | 2.89 | 1:Recurred | Female | ER+/HER2- High Prolif | 31.0 | 4.0 | Died of Disease |
| 6 | 6 | 56.45 | BREAST CONSERVING | Breast Invasive Ductal Carcinoma | Moderate | YES | LumB | Positve | Positive | 2.0 | LOSS | Negative | Ductal/NST | YES | Post | 3 | Right | 1.0 | 4.0 | 4.020 | IDC | 164.333333 | 0:LIVING | Positive | YES | 162.17 | 0:Not Recurred | Female | NaN | 10.0 | 2.0 | Living |
| 7 | 7 | 70.00 | MASTECTOMY | Breast Invasive Lobular Carcinoma | High | YES | Normal | Negative | Negative | 3.0 | NEUTRAL | Negative | Lobular | NO | Post | 4ER- | Left | NaN | NaN | 6.130 | ILC | 22.400000 | 1:DECEASED | Negative | YES | 11.74 | 1:Recurred | Female | ER-/HER2- | 65.0 | 3.0 | Died of Disease |
| 8 | 8 | 89.08 | BREAST CONSERVING | Breast Mixed Ductal and Lobular Carcinoma | Moderate | NO | claudin-low | Positve | Positive | 2.0 | NEUTRAL | Negative | Mixed | YES | Post | 3 | Left | 1.0 | 1.0 | 4.058 | MDLC | 99.533333 | 1:DECEASED | Negative | YES | 98.22 | 0:Not Recurred | Female | NaN | 29.0 | 2.0 | Died of Other Causes |
| 9 | 10 | 86.41 | BREAST CONSERVING | Breast Invasive Ductal Carcinoma | Moderate | NO | LumB | Positve | Positive | 3.0 | GAIN | Negative | Ductal/NST | YES | Post | 9 | Right | 1.0 | 4.0 | 5.032 | IDC | 36.566667 | 1:DECEASED | Negative | YES | 36.09 | 0:Not Recurred | Female | ER+/HER2- High Prolif | 16.0 | 2.0 | Died of Other Causes |

Last rows

| | df_index | Age at Diagnosis | Type of Breast Surgery | Cancer Type Detailed | Cellularity | Chemotherapy | Pam50 + Claudin-low subtype | ER status measured by IHC | ER Status | Neoplasm Histologic Grade | HER2 status measured by SNP6 | HER2 Status | Tumor Other Histologic Subtype | Hormone Therapy | Inferred Menopausal State | Integrative Cluster | Primary Tumor Laterality | Lymph nodes examined positive | Mutation Count | Nottingham prognostic index | Oncotree Code | Overall Survival (Months) | Overall Survival Status | PR Status | Radio Therapy | Relapse Free Status (Months) | Relapse Free Status | Sex | 3-Gene classifier subtype | Tumor Size | Tumor Stage | Patient's Vital Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1970 | 1975 | 51.87 | MASTECTOMY | Breast Invasive Ductal Carcinoma | Moderate | NO | Normal | Positve | Positive | 2.0 | GAIN | Negative | Ductal/NST | YES | Post | 4ER+ | NaN | 5.0 | 5.0 | 5.13 | IDC | 126.666667 | 1:DECEASED | Positive | NO | 53.29 | 1:Recurred | Female | ER+/HER2- Low Prolif | 65.0 | NaN | Died of Disease |
| 1971 | 1976 | 53.87 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | NO | claudin-low | Positve | Negative | 3.0 | GAIN | Positive | Ductal/NST | YES | Post | 1 | NaN | 22.0 | 4.0 | 6.10 | IDC | 6.833333 | 1:DECEASED | Negative | NO | 6.18 | 1:Recurred | Female | HER2+ | 50.0 | NaN | Died of Disease |
| 1972 | 1977 | 52.90 | BREAST CONSERVING | Breast Invasive Ductal Carcinoma | High | NO | LumB | Positve | Positive | 2.0 | GAIN | Negative | Ductal/NST | YES | Post | 6 | NaN | 1.0 | 3.0 | 4.03 | IDC | 78.466667 | 1:DECEASED | Positive | YES | 32.76 | 1:Recurred | Female | ER+/HER2- High Prolif | 15.0 | NaN | Died of Disease |
| 1973 | 1978 | 56.90 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | NO | LumA | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 3 | NaN | 1.0 | 5.0 | 5.09 | IDC | 199.233333 | 0:LIVING | Positive | NO | 196.61 | 0:Not Recurred | Female | ER+/HER2- Low Prolif | 45.0 | NaN | Living |
| 1974 | 1979 | 59.20 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | NO | LumB | Positve | Positive | 2.0 | GAIN | Negative | Ductal/NST | YES | Post | 1 | NaN | 1.0 | 2.0 | 4.03 | IDC | 82.733333 | 1:DECEASED | Positive | NO | 81.64 | 1:Recurred | Female | ER+/HER2- High Prolif | 15.0 | NaN | Died of Disease |
| 1975 | 1980 | 43.10 | BREAST CONSERVING | Breast Invasive Lobular Carcinoma | High | NO | LumA | Positve | Positive | 3.0 | NEUTRAL | Negative | Lobular | YES | Pre | 3 | Right | 1.0 | 4.0 | 5.05 | ILC | 196.866667 | 0:LIVING | Positive | YES | 194.28 | 0:Not Recurred | Female | ER+/HER2- Low Prolif | 25.0 | NaN | Living |
| 1976 | 1981 | 42.88 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | NO | LumB | Positve | Positive | 3.0 | GAIN | Positive | Ductal/NST | NO | Pre | 5 | NaN | 1.0 | 6.0 | 5.04 | IDC | 44.733333 | 1:DECEASED | Negative | YES | 16.09 | 1:Recurred | Female | NaN | 20.0 | NaN | Died of Disease |
| 1977 | 1982 | 62.90 | MASTECTOMY | Breast Invasive Ductal Carcinoma | High | NO | LumB | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 1 | Left | 45.0 | 4.0 | 6.05 | IDC | 175.966667 | 1:DECEASED | Positive | YES | 121.18 | 1:Recurred | Female | NaN | 25.0 | NaN | Died of Disease |
| 1978 | 1983 | 61.16 | MASTECTOMY | Breast Invasive Ductal Carcinoma | Moderate | NO | LumB | Positve | Positive | 2.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 1 | NaN | 12.0 | 15.0 | 5.05 | IDC | 86.233333 | 1:DECEASED | Positive | NO | 85.10 | 0:Not Recurred | Female | ER+/HER2- High Prolif | 25.0 | NaN | Died of Other Causes |
| 1979 | 1984 | 60.02 | BREAST CONSERVING | Breast Invasive Ductal Carcinoma | High | NO | LumB | Positve | Positive | 3.0 | NEUTRAL | Negative | Ductal/NST | YES | Post | 10 | NaN | 1.0 | 3.0 | 5.04 | IDC | 201.900000 | 1:DECEASED | Negative | YES | 199.24 | 0:Not Recurred | Female | ER+/HER2- High Prolif | 20.0 | NaN | Died of Other Causes |